KNN Regression as Geo-Imputation Method for Spatio-Temporal Wind Data

Authors

  • Jendrik Poloczek
  • Nils André Treiber
  • Oliver Kramer
Abstract

The shift from traditional energy systems to distributed systems of energy suppliers and consumers, together with the power volatility of renewable energy, implies the need for effective short-term prediction models. These machine learning models are based on measured sensor information. In practice, sensors might fail for several reasons, and the prediction models naturally cannot work properly with incomplete patterns. If the imputation method, which completes the missing data, is not appropriately chosen, a bias may be introduced. The objective of this work is to propose k-nearest neighbor (kNN) regression as a geo-imputation preprocessing step for pattern-label-based short-term wind prediction on spatio-temporal wind data sets. The approach is compared to three other methods. The evaluation is based on four turbines and their neighbors from the NREL Western Wind Data Set, with missing values distributed uniformly. The results show that kNN regression is the superior imputation method among those compared.
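
As a rough illustration of the idea described in the abstract, the Python sketch below fills missing readings of one turbine by kNN regression on the simultaneous readings of neighboring turbines. The function name `knn_impute`, the synthetic data, the Euclidean distance, and the choice k = 5 are assumptions made for this example; the paper's actual features, parameters, and data are not given on this page.

```python
import numpy as np

def knn_impute(target, neighbors, k=5):
    """Fill NaNs in `target` by kNN regression on `neighbors` (hypothetical helper).

    target:    shape (T,), NaN marks a missing measurement of the target turbine.
    neighbors: shape (T, d), complete measurements of d neighboring turbines.
    """
    target = np.asarray(target, dtype=float)
    neighbors = np.asarray(neighbors, dtype=float)
    missing = np.isnan(target)
    # Time steps with an observed target value form the training set.
    train_x, train_y = neighbors[~missing], target[~missing]
    if len(train_y) == 0:
        raise ValueError("no observed target values to learn from")
    k = min(k, len(train_y))
    imputed = target.copy()
    for t in np.flatnonzero(missing):
        # Euclidean distance between the neighbor pattern at time t
        # and every training pattern.
        dist = np.linalg.norm(train_x - neighbors[t], axis=1)
        nearest = np.argsort(dist)[:k]
        # kNN regression: average the labels of the k closest patterns.
        imputed[t] = train_y[nearest].mean()
    return imputed

# Synthetic data (an assumption, not NREL data): one shared wind signal plus noise.
rng = np.random.default_rng(0)
T = 500
base = 10 + 3 * np.sin(np.linspace(0, 20, T))
neighbors = np.column_stack([base + rng.normal(0, 0.3, T) for _ in range(3)])
truth = base + rng.normal(0, 0.3, T)

# Remove roughly 20% of the target values uniformly at random.
mask = rng.random(T) < 0.2
target = truth.copy()
target[mask] = np.nan

filled = knn_impute(target, neighbors, k=5)
rmse = np.sqrt(np.mean((filled[mask] - truth[mask]) ** 2))
baseline_rmse = np.sqrt(np.mean((np.nanmean(target) - truth[mask]) ** 2))
print(f"kNN RMSE: {rmse:.3f}, mean-imputation RMSE: {baseline_rmse:.3f}")
```

The mean-imputation baseline is included only as a simple point of comparison for the sketch; it is not necessarily one of the three methods evaluated in the paper.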

Similar resources

KNN Classification and Regression using SAS

K-Nearest Neighbor (KNN) classification and regression are two widely used analytic methods in predictive modeling and data mining fields. They provide a way to model highly nonlinear decision boundaries, and to fulfill many other analytical tasks such as missing value imputation, local smoothing, etc. In this paper, we discuss ways in SAS® to conduct KNN classification and KNN Regression. S...

Classification of Efficient Imputation Method for Analyzing Missing Values

In statistical analysis, missing data is a common problem for data quality. Many real datasets have missing data. Imputation preserves all cases by replacing missing data with a probable value based on other available information. Once all missing values have been imputed, the data set can be analyzed using standard techniques for complete data. The aim of this paper is to describe the efficient imput...

Evaluation of Missing Value Estimation for Microarray Data

Microarray gene expression data contains missing values (MVs). However, some methods for downstream analyses, including some prediction tools, require a complete expression data matrix. Current methods for estimating the MVs include sample mean and K-nearest neighbors (KNN). Whether the accuracy of estimation (imputation) methods depends on the actual gene expression has not been thoroughly inv...

Spatio-temporal analysis of diurnal air temperature parameterization in Weather Stations over Iran

Diurnal air temperature modeling is a beneficial experimental and mathematical approach which can be used in many fields related to Geosciences. The modeling and spatio-temporal analysis of air Diurnal Temperature Cycle (DTC) was conducted using data obtained from 105 synoptic stations in Iran during the years 2013-2014 for the first time; the key variable for controlling the cosine term i...

Combining kNN Imputation and Bootstrap Calibrated: Empirical Likelihood for Incomplete Data Analysis

The k-nearest neighbor (kNN) imputation, as one of the most important research topics in incomplete data discovery, has been developed with great success on industrial data. However, it is difficult to obtain a mathematically valid and simple procedure to construct confidence intervals for evaluating the imputed data. This chapter studies a new estimation for missing (or incomplete) data that i...

Publication date: 2014